Building Large-Scale Twitter-Specific Sentiment Lexicon : A Representation Learning Approach

نویسندگان

  • Duyu Tang
  • Furu Wei
  • Bing Qin
  • Ming Zhou
  • Ting Liu
چکیده

In this paper, we propose to build large-scale sentiment lexicon from Twitter with a representation learning approach. We cast sentiment lexicon learning as a phrase-level sentiment classification task. The challenges are developing effective feature representation of phrases and obtaining training data with minor manual annotations for building the sentiment classifier. Specifically, we develop a dedicated neural architecture and integrate the sentiment information of text (e.g. sentences or tweets) into its hybrid loss function for learning sentiment-specific phrase embedding (SSPE). The neural network is trained from massive tweets collected with positive and negative emoticons, without any manual annotation. Furthermore, we introduce the Urban Dictionary to expand a small number of sentiment seeds to obtain more training data for building the phrase-level sentiment classifier. We evaluate our sentiment lexicon (TS-Lex) by applying it in a supervised learning framework for Twitter sentiment classification. Experiment results on the benchmark dataset of SemEval 2013 show that, TS-Lex yields better performance than previously introduced sentiment lexicons.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improving Twitter Sentiment Classification via Multi-Level Sentiment-Enriched Word Embeddings

Most of existing work learn sentiment-specific word representation for improving Twitter sentiment classification, which encoded both n-gram and distant supervised tweet sentiment information in learning process. They assume all words within a tweet have the same sentiment polarity as the whole tweet, which ignores the word its own sentiment polarity. To address this problem, we propose to lear...

متن کامل

A Deep Neural Architecture for Sentence-Level Sentiment Classification in Twitter Social Networking

This paper introduces a novel deep learning framework including a lexicon-based approach for sentencelevel prediction of sentiment label distribution. We propose to first apply semantic rules and then use a Deep Convolutional Neural Network (DeepCNN) for character-level embeddings in order to increase information for word-level embedding. After that, a Bidirectional Long Short-Term Memory netwo...

متن کامل

INESC-ID: A Regression Model for Large Scale Twitter Sentiment Lexicon Induction

We present the approach followed by INESCID in the SemEval 2015 Twitter Sentiment Analysis challenge, subtask E. The goal was to determine the strength of the association of Twitter terms with positive sentiment. Using two labeled lexicons, we trained a regression model to predict the sentiment polarity and intensity of words and phrases. Terms were represented as word embeddings induced in an ...

متن کامل

Creating a General Russian Sentiment Lexicon

The paper describes the new Russian sentiment lexicon RuSentiLex. The lexicon was gathered from several sources: opinionated words from domain-oriented Russian sentiment vocabularies, slang and curse words extracted from Twitter, objective words with positive or negative connotations from a news collection. The words in the lexicon having different sentiment orientations in specific senses are ...

متن کامل

Sentence Modeling with Deep Neural Architecture using Lexicon and Character Attention Mechanism for Sentiment Classification

Tweet-level sentiment classification in Twitter social networking has many challenges: exploiting syntax, semantic, sentiment and context in tweets. To address these problems, we propose a novel approach to sentiment analysis that uses lexicon features for building lexicon embeddings (LexW2Vs) and generates character attention vectors (CharAVs) by using a Deep Convolutional Neural Network (Deep...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014